Modeling load imbalance and fuzzy barriers for scalable shared-memory multiprocessors

Authors

  • Alexandre E. Eichenberger
  • Santosh G. Abraham
Abstract

We propose an analytical model that quantifies the overall execution time of a parallel region in the presence of non-deterministic load imbalance introduced by network contention and by the random replacement policy in processor caches. We present a novel model that evaluates the expected hit ratio and the variance introduced by a cache accessed with a cyclic access stream. We also model the performance improvement of fuzzy barriers, where the synchronization between processors at the end of a parallel region is relaxed. Experiments on a &-processor KSR system, which has random first-level caches, confirm the general nature of the analytic ...
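The idea of relaxing end-of-region synchronization can be illustrated with a small sketch. This is not the authors' analytical model: it assumes a deliberately simple setup in which each processor finishes its region work at some time, and a fixed amount of independent "slack" work (a hypothetical parameter) can be overlapped with waiting when the barrier is fuzzy. The function names are illustrative only.

```python
import random


def mean_idle_strict(finish_times):
    """Mean idle time when every processor waits at a strict barrier.

    Each processor idles from its own finish time until the slowest arrives.
    """
    last_arrival = max(finish_times)
    return sum(last_arrival - t for t in finish_times) / len(finish_times)


def mean_idle_fuzzy(finish_times, slack):
    """Mean idle time when `slack` units of independent work overlap the wait.

    Assumption: a processor signals arrival at its finish time, then runs
    `slack` units of work that do not depend on the barrier. It idles only
    if that work ends before the slowest processor has arrived.
    """
    last_arrival = max(finish_times)
    idle = [max(0.0, last_arrival - (t + slack)) for t in finish_times]
    return sum(idle) / len(idle)


def simulate(num_procs, mean_time, jitter, slack, trials, seed=0):
    """Monte Carlo estimate of mean idle time under random load imbalance.

    Imbalance is modeled (as an assumption) by uniform jitter added to a
    common mean region time.
    """
    rng = random.Random(seed)
    strict_total = 0.0
    fuzzy_total = 0.0
    for _ in range(trials):
        times = [mean_time + rng.uniform(0.0, jitter) for _ in range(num_procs)]
        strict_total += mean_idle_strict(times)
        fuzzy_total += mean_idle_fuzzy(times, slack)
    return strict_total / trials, fuzzy_total / trials


if __name__ == "__main__":
    strict, fuzzy = simulate(num_procs=32, mean_time=100.0, jitter=10.0,
                             slack=4.0, trials=1000)
    print(f"mean idle, strict barrier: {strict:.2f}")
    print(f"mean idle, fuzzy barrier:  {fuzzy:.2f}")
```

Under these assumptions the fuzzy barrier can never increase idle time, and the benefit grows with the amount of slack work relative to the spread of finish times.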


Similar resources

Impact of Load Imbalance on the Design of Software Barriers

Software barriers have been designed and evaluated for barrier synchronization in large-scale shared-memory multiprocessors, under the assumption that all processors reach the synchronization point simultaneously. When relaxing this assumption, we demonstrate that the optimum degree of combining trees is not four as previously thought but increases from four to as much as 128 in a 4K system as ...


Modeling and Performance Evaluation of Multi-Processors Organization with Shared Memories

This paper is primarily concerned with the theoretical evaluation of the performance of multiprocessor systems. A Markovian waiting-line model has been developed for various multiprocessor configurations with shared memory. The system is analysed at the request level rather than the job level.


Scalable Atomic Primitives for Distributed Shared Memory Multiprocessors

Our research addresses the general topic of atomic update of shared data structures on large-scale shared-memory multiprocessors. In this paper we consider alternative implementations of the general-purpose single-address atomic primitives fetch_and_Φ, compare_and_swap, load_linked, and store_conditional. These primitives have proven popular on small-scale bus-based machines, but have yet to bec...


Computation and Data Partitioning on Scalable Shared Memory Multiprocessors

In this paper we identify the factors that affect the derivation of computation and data partitions on scalable shared memory multiprocessors (SSMMs). We show that these factors necessitate an SSMM-conscious approach. In addition to remote memory access, which is the sole factor on distributed memory multiprocessors, cache affinity, memory contention and false sharing are important factors that...


Bidirectional Ring: An Alternative to the Hierarchy of Unidirectional Rings

A hierarchy of unidirectional rings has been used successfully in distributed shared-memory multiprocessors. The fixed cluster size of the hierarchy prevents full exploitation of communication locality. The bidirectional ring is presented as an alternative to the hierarchy. Its relative performance is evaluated for a variety of memory access patterns and network sizes. It gives superior performan...



Publication date: 1995